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Abstract. 

This paper analyzes a special instance of nonsymmetric algebraic matrix Riccati equations arising from trans- 
port theory. Traditional approaches for finding the minimal nonnegative solution of the matrix Riccati equations 
are based on the fixed point iteration and the speed of the convergence is linear. Relying on simultaneously 
matrix computation, a structure-preserving doubling algorithm (SDA) with quadratic convergence is designed for 
improving the speed of convergence. The difficulty is that the double algorithm with quadratic convergence can- 
not guarantee to work all the time. Our main trust in this work is to show that applied with a suitable shifted 
technique, the SDA is guaranteed to converge quadratically with no breakdown. Also, we modify the conventional 
simple iteration algorithm in the critical case to dramatically improve the speed of convergence. Numerical exper- 
iments strongly suggest that the total number of computational steps can be significantly reduced via the shifting 
procedure. 

Keywords, nonsymmetric algebraic Riccati equation, transport theory, shifting technique, 
critical case, structured doubling algorithm, simple iteration method 
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1. Introduction. The nonsymmeric algebraic Riccati equation (NARE), encountered in 
transport theory, is given by 

XCX - XD - AX + B = 0, (1.1) 
where A, B, C and D e E nx " are given by 

A = A-eq T , B — ee T , C = qq T D = V - qe T . (1.2) 



where 



e 



[i,...,i] T e 



q = [qi,...,q n ] , with q% = ^, 

A = diag([ft,...,ftj]), with ft = ChJ . (1+a) ; 
r = diag([di,...,d„]), with di = c ^({_ a) 



(1.3) 



The parameters, used to define the above matrices and vectors, satisfy 0<c<l, 0<a<l and 

n 

the sequences are < u n < • • • < 0J2 < cji < 1, Cj > 0, i = 1, 2, . . . , n, so that c i ~ 1- 

i=l 

For the physical meaning of the NARE (JTTTJ) and its corresponding parameters setup, the 
reader is referred to [13]. Correspondingly, we define the corresponding dual equation of (|1.1[) 

YBY -YA- DY + C = 0. (1.4) 

To facilitate our discussion, we need a nonsingular M-matrix or a singular irreducible M-matrix 
given by 

M=\ D B (1.5) 
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and its relative matrix 

H = JM, (1.6) 

where J = diag(/„, —I n ) with /„ to be the n by n identity matrix. Our interest in this study 
is to find the minimal nonnegative solution X of The existence conditions of the minimal 

nonncgative solution are shown by Juang et al. in |13j . Iterative methods for solving this problem 
are numerous and can be divided into two major categories. 

One is the method sharing a computational cost of 0(n 2 ) arithmetic operations (ops) per 
step, but converges linear or sublinear. The representative method of the first category is the 
simple iteration method (SI) or vector iteration method, which is first proposed by Lu [15J . This 
method is very simple and requires a computational cost of 4n 2 ops per step. Recently, three 
more methods, modified simple iteration (MSI), nonlinear block Jacobi method (NBJ) and the 
nonlinear block Gass-Seidel method (NBGS), based on Lu's method are proposed in [3 [5]. It has 
been shown in 10 that if (a,c) ^ (0,1), the speed of convergence of the NBGS is faster than 
the other three. Generally speaking, the iterative methods mentioned above can be classified as 
accelerated variants of the well-known fixed-point iterations. Also, in [TU] we know that all these 
four methods can provide a linear convergence, if (a, c) ^ (0, 1) and a sublinear convergence, if 
(a,c) = (0,l). 

The other is a method with a cost of 0(n 3 ) ops but provides quadratic convergence. Despite 
of the complexity, quadratically convergent methods are much to be desired in practice. There 
are several good algorithms that can cause quadratic convergence, for example, the Newton 
method [HIS] and the structure-preserving doubling algorithm (SDA) [THIS]- However, when 
(a, c) = (0,1), both Newton method and the SDA algorithm are not always valid and require 
special attention. 

In this work we fine-tune the customary SDA algorithm and make it always workable and 
quadratical convergent when solving (ll.lj) . The SDA algorithm was first proposed by Guo 
et.al. [11] for solving the NARE. In [11] [5], it has been shown to be quadratically convergent, 
if (a, c) ^ (0,1) and linearly convergent with rate 1/2, if (a,c) = (0,1). The later case is the 
so-called "critical case" and is the most challenging problem that we will encounter when solv- 
ing (jl.ip . Roughly speaking, the critical case embedded with some type of singularity, i.e., the 
matrix H has two zero eigenvalues, that will significantly reduce the speed of convergence. In [Sj , 
Guo et al. propose an efficient method based on a single-shift technique to accelerate the com- 
putation of the minimal nonnegative solution so that one singularity can be removed. They also 
show that the doubling algorithm applied to the shifted equation of (jl.l[) converges faster than 
the doubling algorithm applied to (|1.1[) . if no breakdown occurs. The approach of removing two 
zero eigenvalues of H has also been introduced in [5], but again the convergence of the doubling 
algorithm cannot be guaranteed. Our contribution in this paper, which we think is new in theory, 
is to provide a detailed analysis of changes in the eigenvalue distribution of matrices H and M 
as the shift procedures are employed. Through this discussion, the quadratic convergence of the 
SDA is guaranteed via the duble-shift technique to remove two singularities. Most important of 
all, the minimal nonnegative solution of the duble-shift model is shown to be equal to that of the 
original model. We believe such results are the first detailed proofs of the eigenvalue analysis of H 
and M and their corresponding matrices with shift procedures and should be of great significance 
for solving the NARE. 

The organization of this paper is as follows. In Section 2, we review some of the main results 
and definitions that will be used for subsequent discussion. In Section 3, we provide a complete 
discussion on the shifted modifications for the SDA algorithm. We show that the SDA algorithm 
applied to the double-shift problem is always accessible and the solution obtained from the double- 
shift problem is equal to the original NARE problem. In Section 4, advantages of the shifting 
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technique applied to the SI algorithm have been thoroughly investigated. In Section 5, we present 
a few numerical experiments to show the practicability and effectiveness of the shifting procedure 
and concluding remarks are given in Section 6. 

2. Preliminaries. In this section we briefly review the definitions of Z-matrix and M-matrix 
and discuss further some of their properties which are required in the statements and in the 
proofs discussed in the following sections. We also summarize the popular algorithm, SDA, for 
our numerical experiments as we shall see below. 

2.1. Definition and Theorems. In order to formalize our discussion, we start by intro- 
ducing the following two definitions. 

Definition 2.1. A matrix A = {a t j) e R nx ™ is called a Z-matrix if a i3 < for all i ^ j. 
Note that for any Z-matrix A, there exists a matrix B € R nx ™ with B > and some a E K 
such that A — al — B where / is the identity matrix. Also, the definition of Z-matrix plays an 
important role in defining a given matrix to be an M-matrix. 

Definition 2.2. A Z-matrix A is called an M-matrix if A = al — B with B > and 
a > p{B); where p{B) is the spectral radius of B. It is called a singular M-matrix if a = p{B) 
and a nonsingular M-matrix if a > p(B) . 

There are a great many different conditions, which are mathematically intriguing and im- 
portant for applications, that discuss the necessary and sufficient conditions for a given Z-matrix 
to be an M-matrix. For our subsequent discussions, we apply the following two well known and 
useful results in the study of M-matrices. 

Theorem 2.3. JSjj If A G R™ x " is a Z-matrix, the following statements are equivalent: 

1. A is a nonsingular M-matrix. 

2. a(A) c C+. 

3. Av > holds for some positive vector v £ IR ra . 
I A' 1 > 0. 

Theorem 2.4. ' t 9'l If the matrix (|1.5JI is a nonsingular M-matrix, then the NARE and 
its dual equation (|1.4|) have minimal nonnegative solutions X and Y , respectively. Also, matrices 
D — CX and A — BY are nonsingular M-matrix. 

Note that the conditions we list here are only a selection from many more useful ones. See 
[3j [9j [121 E] for a longer list of conditions and references to the proofs. 

2.2. SDA Algorithm. In [TT], Guo et al. come up with the SDA algorithm for solving 
NARE problems and show that if the matrix M (| 1 . 5[) is a nonsingular M-matrix (irreducible 
singular M-matrix [S]), the SDA algorithm is well-defined and quadratically convergent (at least 
linearly convergent with rate 1/2). Its idea is based on the doubling transformation. For more 
details of the doubling transformation, the reader is referred to (TT] Theorem 2.1]. The SDA 
algorithm starts by choosing a positive scalar 7 with 



7 > max < max au . max di 

[l<i<n l<i<n 

Let 

E Q = I n -2 1 V-\ F Q = I n - 2jW-\ 
G = 2 1 D- 1 CW-\ H Q = 2 1 W~ 1 BD-\ 

where 

A 1 = A + 1 I n , D 1 = D + 1 I n , 

W 1 = Ay - BB- X C, V 1 = D 1 - CA- X B. 
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Then, the SDA algorithm presented in [TT] is given by 

Ek+i = Ef.(I n — GkHk) 1 Ek, 
Fk+i — Fk(I n — HkGk) 1 Fk, 
Gk+i = Gk + E k {I n — GkH k ) 1 GkF k , 
Hk+i = Hk+ Fk(I n — HkGk) 1 HkEk, 



(2.3a) 
(2.3b) 
(2.3c) 
(2.3d) 



where the sequence Hk and Gk will converge to the minimal nonnegative solutions X of 
and Y of (| 1 .4[) quadratically. 

2.3. Spectrum Analysis. Recall that in the critical case (a, c) = (0, 1), the matrix M (|1.5p 
is an irreducible singular M-matrix [B] and the corresponding matrix H (|1.6[) has a double zero 
eigenvalue with the geometric multiplicity equal to one. To be specific, the matrix H has 2n real 
eigenvalues v n ,..., i>±, Ai, A„, which satisfy the following order [13] : 



-1 -1 -1-1 ,1,1 

— < v n < <...< — <v 2 < — <^i=0 = Ai< — <A 2 < — < . 

U>2 Wi UJ\ LO2 



The phenomenon is called eigenvalue interlacing. Moreover, 

a(D-CX) = {A 2 ,...,A„,0}, 
a(A-BY) = {0,-/ii,...,-/vh 



,<A„< — . 



(2.4) 



(2.5a) 
(2.5b) 



if X and Y are the minimal nonnegative solutions of (jl.l[) and (|1.4[) , respectively [5. Parallel- 
ing the above distribution, the following theorem shows that all eigenvalues of M are real and 
nonnegative. In fact, M has n specific eigenvalues for i = 1, . . . , n. 

Theorem 2.5. Let M be the matrix defined in (|1.5p with (a, c) = (0,1). Then M has 2n 
real eigenvalues, where one part of the eigenvalues of M are 0, ^,...,A- and the others are 
Hi,... , Hn—i such that the eigenvalues can be arranged in the following order: 

1 1 1 

< < fll < < M2 < • • • < Mn-l < ■ 

LJl UJ 2 UJ n 



Proof. Consider the characteristic polynomial of M defined by 

Tr - XIn 



f(X) = det(M - XI n ) = det( 
"(r - XI n ) 



A - XIn 



det( 

n ^ 

l<i<n 



(A - XI n 

X)(6i-X)(l- 



)(1- 



(r - A/„r 1 



(A - A7„ 
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e 



9i 



1j 



7j 



-A 



Sj-X 



)) 



(2.6) 



l<j<n 

The last equation (|2.6p is called the secular equation of M — A/. Notice that 7, = <5j = -j-, 
<7i = for 1 < i < n when (a, c) — (0, 1). Thus, through a straightforward calculation, we have 



/(A) 



n £ a 



KKn 



i<i<" 



II 1 ^ E 3 



Cj-A 



l<j<n 



-i-A 



l<i<n ! \l<i<" k=ij,l<k<n 
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Thus, / has roots 0,^-, ■ • • , To complete the proof of the theorem, let 

5(A) = e <* n (^- a )- 



l<j<n k^j,l<k<n K 



The sign of 3(^7-) is (— l)-* 1 since the monotonicity of {oJj} , the intermediate value theorem 
indicates that g has at least roots in ( — , —^—) for 1 < j < n — 1. Together with the fact that 
the degree of g is n — 1. The proof of the theorem is thus complete. □ 
It should be noted that 



H 



'In 




In 


X 




X 



(D - CX). (2.7) 



From the above theorem and (|2.7p , we know that the minimal nonnegative solution X is related to 
an invariant subspace with nonnegative eigenvalues of H . Also, it is clear that q T r -1 e+e T A~ 1 g = 
c = 1. We then have the fact [4] that the matrix H has a right eigenvector v T = [vj ,vj], with 
v\ = r _1 q and V2 = A _1 e, so that 

Hv = 0. (2.8) 

By applying this right eigenvector v, a left eigenvector u T — [uj,uj], with u\ — T^e and 
1x2 = —A~ 1 q of H, corresponding to the eigenvalue 0, can be obtained without any trouble by 
directly checking that 

u T H = 0. (2.9) 

Corresponding to the matrix H, the matrix M = JH has the right and left eigenvectors v and 
u T J. Also, it can be seen that ujvi + ujv2 — 0. Applying the eigenpair information, we have 
the following important result given in [BJ |5] . 

Theorem 2.6. Let M be an irreducible singular M-matrix as defined in (|1.5[) . and let 
X and Y are the minimal nonnegative solutions of and (|1.4[) . respectively. Suppose that 

corresponding to the zero eigenvalue, the right and left eigenvectors of M are v T = [vj ,vj] and 
u T = [uj , — uj ]■ If (ct,c) — (0, 1), then the following properties are satisfied: 

Xvi — i>2, uj X = —uj, and Yv2 — v%. (2-10) 



It was shown in [8], that the matrix X is the minimal nonnegative solution of 11.11 if and only 
if X T is the minimal nonnegative solution of the equation 



X T C T X 1 — X 1 A 1 — D' X ' + B ' =0. (2.11) 



The same statement can be applied to the dual equation (|1.4[) . Its proof is simply based on taking 
the transpose on both sides of (|l.ip . 

Corollary 2.7. The matrix Y is the minimal nonnegative solution of (|1.4[) if and only if 
Y T is the minimal nonnegative solution of the equation 

Y T B T Y T -Y T D T - A T Y T + C T = 0. (2.12) 



Following Corollary (|2.7p , we want to know that whether there exists a relationship between 
the left eigenvector of M and the minimal nonnegative solution Y. To begin with, let 



M t = 



D 1 
-C 1 



-B 1 



(2.13) 



() 



be the corresponding M- matrix of (|2.12p . Note that M t has a right eigenvector [uj , — uJ] T and 
a left eigenvector [uj , uj] corresponding to the eigenvalue 0. Equipped with the notations given 
in (|1.2[) . the matrix M t is again an irreducible singular M-matrix if (a, c) = (0, 1). Then, Theo- 
rem [2J3 asserts that Y T ui = — u^. Namely, we have derived the following important relationship 
between the left eigenvector u and the minimal solution Y, 



Y = 



T 



(2.14) 



On the other hand, we know that the convergence rate of the SDA algorithm is determined 



by 



limsup 2 y/\\H k -X\\ < p(C 7 (D - CX))p(C 7 {A - BY)), 



k— >oo 



where 



limsup 2 y/\\G k -Y\\ < p{C- t {D - CX))p(C 7 {A - BY)), 



Cry '. Z 
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(2.15a) 
(2.15b) 

(2.16) 



is the Cayley transform and the scalar 7 > [8]. Note that from (|2.5p . we have p(C 1 (D — CX) = 
p(Cy(A — BY)) = 1. It follows that no further conclusion of the convergence rate of the SDA 
algorithm can be derived from the fact (|2.15j) except that the linear convergence is guaranteed. 
In the subsequent section, we want to know that how the shift procedures affect the convergence 
rate. 

3. Properties of the Shifted NARE. In this section, a detailed analysis of the eigenvalue 
distribution of the matrix M is provided with respect to the the critical case (a, c) — (0, 1). It 
is shown that under the shifting technique, the matrix M is still an M-matrix and the SDA 
algorithm is guaranteed to converge. The minimal nonnegative solution in the shifted NARE 
problems are proved to be equal to the minimal nonnegative solution of (|1.1|) . Last but not least, 
the SDA algorithm is shown to be accelerated by removing the singularities embedded in the 
matrix H . 

3.1. Single Shift. Let H be the rank-one modification of the matrix H which is defined by 



H = H 



rjvr 



where 77 > is a scalar and r > is a vector satsifying 



(3.1) 

= 1. To be specific, we write 



r i j r 2 ]) where r\ — e, — q. Then, two matrices H and M are denoted by 



H = 



D -C 
B —A 



M = 



D 
-B 



-C 
A 



(3.2) 



where 



D = D + r]Vir 1 , 



C = C- riv ir < , 
B + rjV2rl , A = A — r/V2rJ 



(3.3) 



It follows from the specific structure of M given in (|3.2j) that the matrix M is irreducible. 
The nice feature of this rank-one modification is that one zero eigenvalue of H will be replaced by 
the scalar 77 > 0. This can be seen by directly applying the following useful lemma shown in [S]. 
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Lemma 3.1. Let T be a singular matrix with Tv = for some nonzero vector v. If r is a 
vector so that r T v = 1, then for any scalar r, the eigenvalues of the matrix 



T = T 



rjvr 



consist of those of T , except that one zero eigenvalue ofT is replaced by r\. 

It can be seen that from Lemma 13.11 the eigenvalues of H and H are the same except that 
one zero eigenvalue is shifted to rj. In the next theorem, we want to show that despite of the rank 
one modification, the eigenvalues of M^are equal to those of M. 

Corollary 3.2. Let M and M be defined in (|1.5I) and (|3.2p , respectively. Then, the 
characteristic polynomials of M and M are conincident. That is, the eigenvalues of M and M 
are equal. 

Proof. This proof can be easily obtainedjjy studying the characteristic polynomial of M. We 
know that the characteristic polynomial of M, denoted by /(A), is defined by 



L - XL, 



/(A) = det(M - XI 2 n) = det( 

n 

= JJ(±.-A) 2 dct(l+[e T g T ] 

n 1 n » 

= -IK~-*) a £-T* A 



A - AI„ 
(L - A/„)- ] 



(-/„ + 7 ? r- 1 ) (Z ' 

(-I n -riA- l )e 



[e T 9 T ]) 



(a - xi n y 



(-L„ 

(-In 



3 = 1 wj 



A 



(3.4) 



From (13.41) . we know that the eigenvalues of M are precisely those of M. □ 

Theorem 3.3. The matrix M defined by equation (|3.2I) is a Z-matrix if and only if the 
parameter rj, defined in (|3.9[) satisfy 



1 

< 77 < — . 

UJi 



(3.5) 



Proof. From p. 21) . M is a Z-matrix if and only if B > 0, C > 0, and D and A are Z-matrices. 
Note that 



D = r + (-/„+r / r- 1 ) g e T , C = (L n - r,T- 1 )qq T , 

B = {L + nA- 1 )ee T > 0, A = A +(-/„- 7?A- 1 )eg T . 



(3.6) 



The sufficient and necessary condition such that the matrix M is a Z-matrix is that C > 0, and 
D and ^4 are Z-matrices. This implies that 



- 1 + Wi < o. 



(3.7) 



Since rj is positive, we have the fact that M is a Z-matrix if and only if (|3.5p is satisfied. □ 

Using Corollary 13.21 and the given constraint (|3.5p in Theorem 13. 3[ we know that M is an 
irreducible M-matrix and the SDA algorithm is guaranteed to be applicable. It is known that 
the minimal nonnegative solution X of the single shifted NARE is equivalent to the minimal 
nonnegative solution X of (jl.ip (5). Thus, we have 



limsup 2 y/\\H k -X\\ < p(C 7 (D - CX))p(C^{A - BY)) < 1, 

k— >oo 



(3.8) 



since p(C 7 (D — CX) < 1 and p(C^(A — BY)) = 1. It concludes that the convergence of the SDA 
algorithm with a single shift is faster than that with no shift. Based on all the properties stated 
above, it is illuminating to begin the analysis of the double shifting technique. 



s 



3.2. Double Shift. In order to remove all zero eigenvalues of H, we define the double shifted 
matrix H, 



H = H + rjvr 1 + £su 1 = 



D -C 
B —A 



(3.9) 



where r\ > 0, £ < 0, p T and q T such that p T v = q T u = 1, each size of sub-matrices A, B, C and 
D are n square. This is the so called double shifting technique. Indeed, it can be seen that if we 
choose s T = [sj , sj ] with si = q and s 2 = — e and the same vectors r, m and w as defined above, 
then the vectors p and q satisfy the fact that 



r T v = s T u = e T r 1 q 



q T A~ 1 e = 1. 



(3.10) 



From Lemma |3. 11 we know that the double shifting technique will move one zero eigenvalue of H 
to rj > and the other to £ < and keep the nonzero eigenvalues unchanged. With this in mind, 
the shift technique introduced in formula (|3.9p will make the new matrix H nonsingular. Also, 
we can define a duble shifted NARE in X 6 R™ xn associate with the matrix H as follows: 



XCX - X D - AX + B = 0, 
and the dual duble shifted NARE inFeK nxn 

YBY -YA-DY + C = 0, 



where 



D 
B 



D 
B 



i]Vir{ 
nv 2 rj 



C 
A-- 



C 



A - rjv 2 rj - £,s 2 uj. 



(3.11a) 
(3.11b) 

(3.12) 



In what follows, we show that under suitable assumptions on parameters n and £, the matrix 
M defined by 



M 



D_ -C 
-B A 



(3.13) 



is a nonsingular M-matrix, that is, the SDA algorithm is well-defined and applicable to the 
NARE (|3.11ap . We start our proof by showing that this matrix M is a Z-matrix for some 
parameters rj and £. 

Theorem 3.4. The matrix M defined by equation (I3.13|) is a Z-matrix if and only if the 
parameters, rj and £, defined in (|3.9p satisfy the following two conditions: 



< X] < — , 



-1 + 



< £ < 0. 



(3.14a) 
(3.14b) 



Proof. It follows from (13. 13)) we know that M is a Z-matrix if and only if B > 0, C > 0, and 
D and A are Z-matrices. Also, from p,12j) we have 

d = r + (-/„ + 7 1 r- 1 )qe T + e^r- 1 , 

~C=(I n -7 1 T- 1 )qq T + S;qq T A-\ 
B = (I + riA- 1 )ee T - £ee T r^ 1 > 0, 
A = A + (-/„ - rjA'^eq 1 - ^eq T A~ 1 . 
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Therefore, in order to get a Z-matrix M, we only need to consider when C > 0, and D and A are 
Z-matrices. This gives rise to the following three sufficient and necessary conditions: 



-1 + r/uii + £uj n < 0, 

-l + Tywi-^wi < 0, (3.15) 
-1 — r]u> n — £u>i < 0. 



It follows from (|3. 151) and the initial conditions r\ > and £ < that M is a Z-matrix if and 
only if (|3.14a[) and (|3.14b|) are satisfied. □ 

To simplify our discussion, we define 

O = {(7 7 ,£);0<7 ? < — , Zl±^ <g<0}. (3.16) 



Our next approach is to show that the matrix M is indeed an M-matrix. That is, the iterative 
processes in SDA algorithm do not break down and convergence quadratically. To begin with, we 
introduce the following two lemmas. 

Lemma 3.5. Let a and u>i, for i = 1, . . . ,n, be defined in (jl.ip . Given A 6 K and A ^ 
for i = 1, . . . , n, we define 



.9i(A) = A£ X ^, ^(AHE^p 9s(X)=J2—£--. (3.17) 

i=l un i=l uii j=l 8 Vw» ' 

Then, the following properties hold: 

n 

1. gi(X) - X 2 g 2 (X) = XJ2 c l u l . 

2. « ?1 (A)- 53 (A) = -l/" 1 

^/Ae (£, j^), tten 53 (A) > 9l (\)£- > g 2 (\)±-. 

Proof. The first two properties are following from the direct computation. To see this, 
applying the conditions in (|3.17[) . we have 



5i(A)-A 2 32 (A) = A^ 



i=l 



(cj - XcjbJi)ui 



1=1 



ffi(A)- gs(A)=^ =-1. 
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Using the triangle inequality and A £ (^-, ^ ), we obtain 



53( A ) = E ,. f i' V) 
i=i ~ A - ) 

A; n 



> 5i(A) - 



> 



1 

n 



f-r (^-a) v (-^ — a) y 



> 32(A)— . 

□ 

We have now seen that the relationships among 31(A), (72(A), and 33(A). Let 3(A) to be a 
function satisfying 

.g(A) = A 5l (A)+7 7 e3 2 (A).g 3 (A), (3.18) 

where (j), £) G f2. Our next approach is to show that for each subinterval (— , Uk ± ) with k — 
1, . . . , n — 1, there exists a point A so that 3(A) > 0. This property is a stepping stone for showing 
that M is an M-matrix. 

Lemma 3.6. Let a and Ui, for i = l,...,n, be defined in ^TTTj) . It i/ien, follows that there 
exists a point Xk £ (jj-, 1 - ), /or fc = 1, . . . , n — 1, so that the function 3(A) of (|3.18j) is greater 
than zero. 

Proof. Note that 33(A) is a continuous function on (— , ), lim 33(A) = —00 , and 
lim 33(A) = +00, for all A; = 1, . . . , n — 1. Thus, there exists a point A& 6 (-j-, ) such 



4, .2 

3 3 (A fc ) = (3.19) 



that 



Since uj\ > ui2 > ■ ■ ■ > w„, we have the fact that 33 (A&) > 4. It follows from Lemma 13.51 that 
3i(A fc ) > 0. 

We first assume that 32(Afc) < for this specific A&, then it is clear that g(\k) = Afc3i(Afe) + 
? ?^32(Afc)33(Afc) > 0, since ry£ < 0. We now assume that 32(Afc) > 0. Combining the inequali- 
ties (|3.14a|) with (|3.14bl) . we have 



1 

Then, by (|3T8)) we get 

3(Afc) > 2^ 



2 < 77^ < 0. (3.20) 



^-^) + ^ ^A^grr) > 0> (3 21) 

i=l w< Afc Afc 
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since At. — — 

K bJ k UJ k + 1 

k + 1 < i < n. 
□ 



< 



0, for 1 < i < k, and A^ 



u k Uk+i 



0, for 



Now we have enough tools to validate that the given matrix M is indeed an M-matrix. In 
particular, we can also dig out the eigenvalue distribution of matrix M. 

Theorem 3.7. // (n, £) £ £1, then the matrix M defined by equation (|3 . 1 3|) is an M-matrix. 
In particular, M has 2n positive real eigenvalues Ai, . . . , \2n satisfying 



< Ai < A 2 < — < A 3 < A 4 < — < 

U>1 U>2 



< — < A 2 „_i < A 2 „ < — (3.22) 



Proof. Since the matrix H of (13. 9[) is nonsingular, it is clear that M = JH is nonsingular. 
Also, Theorem l3.4l implies M is a Z- matrix. In order to show that M is an M-matrix, it suffices to 
show that all eigenvalues of M have positive real part. Indeed, all eigenvalues of M are positive 
real numbers and satisfy the interlacing property. 

We first consider the characteristic polynomial /(A) of M defined by 



/(A) = det(M - A7 2 „) 
Tr - XI n 



A - \I„ 



dct( 

-n<^-A>^) 

i— 1 

n 1 n 

-s* n (--*>£ 



(-In-vA-^e 



£ e T r~ 



Cfc 



i— 1 l<s<n,S7^i 



fc=l 



(-A" + — 



A , 



n <i 

l<s<n,S7^fc 



A) 



(3.23) 
(3.24) 



where g(X) is the function given in (f3~T8l> . By direct substitution of -3- in (|3~2"4"]) . we have 
/(^) > 0, for A: = 1, . . . , n. Also, it follows from (I3.23[) that /(0) > 0. If we can find a point A 
satisfying /(w) < in each subinterval (^, ^ ), for k = 1, . . . , n and the interval (0, ^-), then 

the intermediate value theorem imply that the distribution of eigenvalues of M arranged in (|3.22[) 
is valid. This also gives rise to the fact that M is a nonsingular M-matrix. 

Next, we consider the subinterval (0, — ). Choosing A = it follows that 



1 n n n 



'2oji 2wi 



2=1 0Ji 2cji i—\ ^^u)i 2ui ) 

\ / . \- c^iK — ) -,! . \ - etui 



i=l uJi 2wi 
n 

i=2 



i=2 



i=2 



> 



(3.25) 
(3.26) 



The second inequality (|3.25[) comes from the fact that 77^ > 



1 



Also, since — < 1 and 



2ui 



< c, ; , for i = 2, . . . ,n, and ^ c 4 = 1, we have the last inequality (|3.26p . For the proof 

i=i 

of each subinterval (^-, ^ ), we simply apply the conclusion of Lemma 13.61 Then, (I3.23[) 
immediately implies that there exists a point A 6 ( — , — — ) such that /(A) < 0, for k — 1, . . . , n. 
□ 

Note that in [5] the minimal nonnegative solution X of (jl.lj) has been shown to be a solution 
of (|3.11a[) . So far, to the best of our knowledge, no study has investigated the relation between 
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the solutions X and X. If there does not exist any relation between X and X, the solution 
obtained from the duble-shift algorithm would be exclusively meaningless. Our next result is to 
find this substantial link through the known fact that M is indeed an M-matrix (|3.7I) . 

Theorem 3.8. Let X and X be the minimal nonnegative solutions of (|3.11al) and (II. ip . 
respectively. Then, o~(D — CX) = {77, A2, . . . , A„} and X — X . 

Proof. Let K{Z) = ZCZ -ZD-AZ + B and K{Z) = ZCZ - ZD -AZ + B. Observe first 

that 

U{X) = TZ(X) - n(X Vl - v 2 )(rjx + rj) + aX Sl - s 2 )(-ujx - uj) = Tl(X), (3.27) 

where the second equality follows directly from Theorem 12.61 This equality amounts to say 
that the minimal nonnegative solution of (jl.ip is also a nonnegative solution of (|3.11a[) and the 
following equality is satisfied. 

(D-CX). (3.28) 
Recall that uj + uj X = 0. Then, we have 



H 



In 




In 


X 




X 



(D - CX) =D-CX + r]vi(rj + rjx) + £si(uj + ujx) 
= D -CX + V v 1 (rJ +rJX). 

Together with the fact that 

(D - CX)V! = (T - qe T )T- 1 q - qq T A^e = 0, 

and 

(rj + rJX)vi = e T T-\ + q T A' 1 e = 1, 

we obtain 

(D-CX)v! = (D-CX)vi =tivi. 

Then, Lemma [3~T1 and Theorem 12.51 imply that a(D — CX) — {77, A2, . . . , A„}. 

Since M is a nonsingular M-matrix and X is the minimal nonnegative solution of (|3.11a[) . 
Theorem 12.31 and Theorem 12.41 imply that a(D — C X) C C+. With this in mind, we have 



a(D -C X) = a(D - CX). 



Note that 



H 



'In 




In 


X 




X 



(D-CX). 



(3.29) 



(3.30) 





In 




In 


span 


X 


= span 


X 



By (|3~29| and ([3^30]) . it is true that 



Then, there exists a nonsingular matrix S E W ixn such that 

S. 



'In 




'In 


X 




X 



13 



It is clear that this nonsingular matrix S is an identity matrix. So, we conclude that X = X. 
□ _ 

From Theorem 13.71 and Theorem 13. 81 we know that M is a nonsingular M-matrix. Then, the 
SDA algorithm is guaranteed to converge. Similar to the discussion given in the single shifted 
algorithm, we have 

limsup 2 {/\\H k -X\\ < p(C 1 (D — ~CX))p(C 1 (A - ~BY)) < 1, (3.31) 

k— >oo 

since p(C 1 (D -CX) < 1 and p(C 1 (A - BY)) < 1. This also implies that for any (ry,£) e fi, the 
SDA algorithm with double shifts converges faster than the SDA algorithm with no shift and is 
quadratically convergent. 

4. Advantages of the Shifting Technique Applied to SI. In [15 , Lu shows that the 
minimal nonnegative solution X of (|1.1[) must be of the form: 

X = T o (mn ) = (run 1 ) o T. 

Here, the symbol o is the Hadamard product, T = (iy) = > an d (m,n) is satisfying the 

vector equation: 

to = to o (Pn) + e, (4.1a) 
n = n o (Qm) + e, (4. lb) 

with 

p = <p * )= fe)' = (0 « )= fe)' (4 " 2) 

The SI method for finding the minimal nonnegative solution (to, n) is then given by 

m (fe+i) = m (fe) o ) + 6j (4. 3a ) 

n (*+i) = n (*) o (Qto^)) + e. (4.3b) 

Our aim in this section is to discuss how the shifted approaches can speed up the SI method. 
Theoretical discussion is also given to analyze the convergence of the SI method with shift. We 
then rewrite the coefficient matrices (13. 12)) as 



with 



Qi 

Ei 



£= ; D(r / ,e)=r-Q 1 ( ?7 ) J B 1 (C) T , 

c = c(r 1 ,o = Qi(v)Q2(0 T , 

B_=B(r ) ,0=E 2 (T ] )E 1 (Z) T , 

A = A(r h t)=&-E 2 ( V )Q 2 (t) T , 



Qx{ri) = [{In-riT- 1 )q q] , 

Ei(0 = [e -£r-M , 



Eo 



Qa(0 
E2(r,) 



q 'q 

{In 



?7A~ 1 )e 



(4.4a) 
(4.4b) 
(4.4c) 
(4.4d) 



and relax the boundary conditions (77, £) so that (17,^) € fi. Here, O is the closure of the set f2 
defined in (j3~T6l) . Substituting (|44)) into (|3.11ap . we have 



ZT + AZ = [ZQ X + E 2 )(QjZ + Ej) 



(4.5) 
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This implies that the minimal nonnegative solution Z of (|3.11al) can be written as 

Z = To(MN T ), (4.6) 

with M = ZQ t + E 2 e M" x2 , N T = QjZ + Ej e K 2xn . 

Akin to the iteration given in (j4.3[) . the iteration sequence {Mk,Nk} corresponding to (|4.6[) 
can be written as 



M k+ i = (T o (Af fe 7V fc T ))Q 1 + E 2 
N k+1 = (To{N k Ml))Q 2 +E 1 



(4.7a) 
(4.7b) 



with the initial value 



M = 0, N = 0. 



(4.7c) 



Let Zk(r),£) = Zk = T a (MkNj), for all k. Corresponding to (|4.5p . we then have the classical 
fixed-point iteration, 



z k+1 ee t o ((z fc Qi + s 2 )(gj Z fc + Ejj) 



(4.8) 



Our next theorem is to show that the sequence {Zk} does indeed converge and converge to 
the minimal nonnegative solution X of (jl.ip . 
Theorem 4.1. Assume that 



R(X*) = X*CX* - X*D - AX* + B < 0, (4.9) 

with initial value Zq = 0. 



for some nonnegative matrix X* . Then for the fixed-point iteration 
we have 



Zq < Z\ < ■ ■ ■ < Zk < X* , for any k > 1. 



(4.10) 



Moreover, lim Zk(r]^) = X for any (77, £) G 

fc— 7-OC 

Proo/. By flU}, > 0, QiQj = C > 0, S 2 ^7 = B > 0, and £; 2 Q 2 > 0. It follows 

that (|4.10j) holds by induction. Since the sequence {Zk} is monotonically increasing and bounded 
above, we have lim Zk = Z*, for some Z*. Hence R(Z*) = 0. On the other hand, since Z* < X* 

H— ¥oo 

for any nonnegative matrix X*, we have Z* = X. □ 

The convergence property, shown in Theorem 14.11 is of fundamental importance in our sub- 
sequence discussion and can induce the possibility of analyzing a number of convergent behaviors 
in the SI method with shift. Note that since Mk and Nk are matrices in IR" x2 , we can define 



Mk = 



Ak) 



N k = 



,( fe ) J k ) 



(4.11) 



where m[ k \ m 2 k \ and are n-dimension column vectors. It follows that we have the 



equivalent iteration for Zk, that is, 



Z k = T o (m[ k \n{ k) ) T + m {k \n {k) ) T ) 



(4.12) 
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Then, we obtain the new algorithm of the SI with shift, given by 



m 



(fc+l) _ 
1 

(k+l) _ 



''1 
(AH 
*2 



Z k {I n ~ rjT-^q + (I n + rjT-^e, 
Z k q + e, 
Zjq + e, 



with the initial value 



,(°) _ 



= 0, 7t4 0) = 



0, ,4 0) 



0. 



(4.13a) 
(4.13b) 
(4.13c) 
(4.13d) 

(4.14a) 
(4.14b) 



It is true that this SI iteration with shift is still a method with a cost of 0(n 2 ) ops but 
requires more calculations than the original SI method. However, in order to have a method with 
a better behavior, adding some complexity is sometimes a necessary sacrifice. Actually, we can 
simplify our computation by consider the following iteration, 



(fc+i) 



= \Z, 



In] 



m[ k+ V=mi k+ V+ V [Z k I n ] 



(k+l) _ 



(k+l) 



In 

tffi In] 



-A^q 

r- x e 



( fe ) ™W „( k ) J k ) 



In next theorem, we discuss the convergent property of the sequence (m\ 
and the convergent speed of the sequence Z k ■ 

Theorem 4.2. Given (a, c) = (0,1), the sequence (mf , m^' , , -n^' ) with initial val- 
ttes (|4.14p is strictly monotonically increasing and satisfies the following two conditions: 



a. e < mf < m, e < m^' < m, e < n^' < n, < n^' < —£T 



(k) 



lim rol^ = lim ml*' = m, 



lim 



.CO 



= n, lim ?7 

fc— ■ 



= 0, 



where m and n are defined on (|4.1[) . In /ac£, m f/ie critical case, we have m = n and X = X T 
Proof. From Theorem 14.11 we know that Z$ < Z\ < 

Substituting these two facts to (|4. 1 3[) . we immediately have 



< Z k < X and lim Z k = X. 

k— foo 



< 
< 
< 
> 



m 



(i) 



rn 
n 
n 



(i) 
l 

(i) 
i 

(i) 

2 



< 
< 
< 
> 



m 



in 



(2) 
2 

(2) 



4 2) 

,(2) 



< . 


. < 


(fc) 

m 2 


< 


< . 


. < 


(fc) 
m\ 


< 


< . 


. < 


(fc) 
n\ 


< 


> . 


. > 


(fc) 
n-2 


> 



Xq + e = m, 
Xq + e = m, 
X T q + e = n, 
-^(r- 1 e-X T A- 1 q) = 0. 



Note that the order of the sequence {m^} comes from the fact that Z k q — r/Z k T 1 q > and the 
last equality of the sequence {n 2 } comes from Theorem 12.61 
From (|4.12[) . part (b) holds, since lim Z k = X. □ 

fc— >oo 

When we studied the shifted procedures, our main purpose is to speed up the convergence. 
In what follows we discuss the relations of Z k (n,^) with respect to different r\ and £ values and 
show that the SI with shift converges linear, instead of sublinear. 

Theorem 4.3. Given (a, c) = (0, 1), the sequence {Z k } has the following two properties: 



1G 



a. Z k (0,0) < Z k (r],0) < Z k (v,0, for each k and fa.f) g SI. 

6. The sequence {Z k (r],^)} converges linearly to the minimal nonnegative solution X of 
HO) for all (rj,0 € SI. 
Proof. From (|4.6[) . we have 



Z = T o {ZQ 1 (i 1 )Q 2 (C) T Z + £ 2 (r?)Q 2 (0 T Z + ZQi(t?)Ei(0 T + E 2 ( V )E 1 (0 T ) 
= T o ((ZQi(0)g 2 (0) T Z + £ 2 (0)Q 2 (0) T Z + ZQi(0)Si(0) T + £ 2 (0)£i(0) T ) 
+ 77T o ((-Zl^g + A~ 1 e)(q T Z + e T )) - £T ° ((Zg + e)(-< Z T A- 1 Z + e^" 1 )) . 

Subsequently, it follows from mathematical induction that part (a) holds. 

For the proof of part (b), we need to use three well-known results discussed in [5]. First, for 
the iteration (|4.8[) and Zo(r],£) = 0, we have 



limsup y\\Z k (r),£)-X\\ 

= p {{I ® A + T ® I)- 1 [/ ® (^^7 + AC) + (Qi^7 + CX) ® J]) , (4.15) 

where ® denotes the Kronecker product (see [H Theorem 3.2]). Second, let Mj = 7® (A — XC) + 
(£) — CX) T ® J. Then, Mx is a Z-matrix since both A — XC and D — CX are Z-matrices. (see 
[5J Remark 1.1]). Also, Mx is a nonsingular matrix since any eigenvalue of Mj is the sum of an 
eigenvalue of A — XC and D — CX. This implies that Mx is a nonsingular M-matrix. Third, if 
Mx is a nonsingular M-matrix, then 

p {{I ® A + T g> I)" 1 [7 ® (^^ + XC) + (Qi£?7 + CX) ® /]) < 1, (4.16) 



that is, limsup y/\\Z k (r),£) - X\\ < 1. (see [5J Theorem 3.3]) 
□ 

5. Numerical Implementation and Comparisons. To illustrate the consequence of the 
previous sections, numerical experiments, consisting of SDA and SI methods after the shifting 
technique, are presented to demonstrate our conclusion. All computations are performed in 
MATLAB/version 2010b on a iMac with an 2.8GHZ Intel Core i5 processor and 16GB main 
memory, using IEEE double-precision. 

In the next implementations, the relative error for the SDA is defined by 



Err 



SDA 



\G k — Gfc-iHoo \\H k — Hk-i oo 



||G fe |U ' \\H k \ 
the relative error for the SI with no shift is defined by 



Errg/ = max 



| m (*) _ m (*-i)|| || n (fe) _„(*-!) I 



the relative error for the SI with the shifting procedure is defined by 

HMfc-Mfc-illc ||JVjfe — JV*_x||c 



Err 



SIS 



llMfcHoo ' ||JVfc||oo 

and the relative normalized residual is defined by 

|A fc r + AX k - (X k q + e)(q T X k + e T )\\ c 



Res = 



||Jffc||oo||r||oo + MoollAlU + (HAfelUllglU + ||e|| 00 )(|!g T || 00 ||A fe || 00 + ||e T || c 
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where Xk = Gu for the SDA algorithm, X). = T o (m^n^ ) for the SI algorithm with no shift 
and X/, = T o (M^N^) for the SI algorithm with shift. All iteration methods are terminated 
whenever the relative errors or the relative normalized residual residuals are less than n 2 e, where 
e = 2~ 52 = 2.22 • 1CT 16 be the machine zero. 

Example 5.1. In this example, we compare the methods for finding the minimal nonnegative 
solution of (jl.ip by using the shifting technique. We explain the efficiency of the SDA and SI 
applied to the shifted equations (|3.3[) and (|3.12p , respectively. We consider (jl.ip with (a,c) — 
(0,1). As suggested in JP[ | l^j , the constants Ci and uji are the nodes and weights, which are 
obtained by dividing the interval [0, 1] into n/4 subinterval of equal length and applying to each 
subinterval the A-node Gauss-Legendre quadrature. 

In tabled we report a comparison of residuals and the number of iterations for the SDA with 
no shift, the SDA with a single shift, the SDA with double shifts , the SI with no shift, the SI with 
a single shift, and the SI with double shifts and with size n — 32, 64, 128, and 256. From tabled 
we have the following two conclusions. 

First, in the critical case (a, c) = (0, 1), it is known that the SDA algorithm converges linearly. 
After applied to the shifted equation, the SDA algorithm converges quadratically. As shown in 
Table [5l the number of steps required in the SDA algorithm with a single shift or double shifts 
are around half of those of the SDA algorithm with no shift. Also, the computed solution of the 
shifted equations is more accurate than the one obtained with no shift. The numerical phenomena 
are in accordance with the theoretical discussion given in ffl. 

Second, we randomly choose r\ and £ from the set Q. Indeed, in Tabled we have (17, £) = 
(^-,0) for the single-shift problems and (?7, £) = {-^j-, 5^7) f or the double-shift problems. We 
see that even with 10000 steps, the solution obtained from the nonshifted problems can only have 
accuracy up to 10 -8 . On the other hand, the solution for the shifted problems can have the 
accuracy better than 10 -10 and a dramatical decrease in the number of steps required in the 
computation. Also, the iteration counts listed in Tableware in accord with Theorem \4-ty 



n 


SDA(no shift) 


SDA(single shift) 


SDA(double shifts) 


32 
64 
128 
256 


9.7e-14(27) 
4.2c-13(27) 
1.7e-12(27) 
6.8e-12(27) 


4.5e-15(ll) 
1.6e-14(12) 
4.2e-14(13) 
1.2e-13(14) 


7.4e-15(ll) 
1.9c-14(12) 
6.1e-14(13) 
1.4e-13(14) 


n 


SI(no shift) 


SI(single shift) 


SI(double shifts) 


32 
64 
128 
256 


* (>10000) 
*(>10000) 
*(>10000) 
*(>10000) 


2.4e-13(164) 
1.0-12(154) 
4.0-12(145) 
1.6-11(136) 


2.9e-13(40) 
1.3e-12(38) 
5.4e-12(36) 
2.2c-ll(34) 



Table 5.1 



Comparison of the residuals (and in parentheses the number of iterations) of the SDA and SI techniques. 



6. Conclusion. The challenging issues of applying the SDA algorithm to the shifted NARE 
problems are to develop a well-defined sequence, to guarantee the convergence of the sequence, 
and to associate the solutions of the shifted problems with the original one. All these issues 
related to the structued NARE Ijl.ljl have been studied in our work. Numerical experiments show 
the improvement of the speed and accuracy while applying the SDA algorithm to the shifted 
problems. Note that the bottleneck for applying this algorithm is to compute the inverses of 
(7„ — HkGk) and (J n — GkH^), which apparently have an 0(n 3 ) complexity. Compare with 
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the Newton method, which has been shown to have 0(n 2 ) complexity i4| , an interesting problem 
worthy of further investigation is to reduce the computational cost by taking the specific structure 
of (|1.1[) into account. 

On the other hand, while applying the SI algorithm to the critical case, its convergence is 
very slow and has almost stopped. Through the shifting technology, a new iteration method 
has been introduced and preserve the linear convergence. Numerical experiments show that 
while considering the SI algorithm, the convergence with double shifts is much faster than the 
convergence with a single shift or no shift. We believe the results we obtain are new in the field 
and could provide considerable insight into the NARE problems. 
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